ABSTRACT



An apparatus for correlating an unknown word with a 
plurality of dictionary words includes a first memory 
area storing data corresponding to the unknown word 
and a plurality of second memory areas respectively 
storing data corresponding to the plurality of dictionary 
words. A control mechanism coupled to the first and 
second memory areas divides the unknown word into a 
plurality of unknown word fragments and divides each 
of the plurality of dictionary words stored in the second 
memory areas into a plurality of dictionary word frag- 
ments. An arithmetic logic unit compares the unknown 
word fragments to the dictionary word fragments and 
generates an output signal indicating a hit when an 
unknown word fragment corresponds to at least one of 
the dictionary word fragments. A hit counter counts the
number of hits. The first memory area and the plurality 
of second memory areas may be implemented using 
circular shift registers. 

9 Claims, 5 Drawing Sheets 
APPARATUS FOR IDENTIFYING UNKNOWN WORDS BY COMPARISON TO KNOWN WORDS

TECHNICAL FIELD

The present invention relates to an apparatus and method for parallel
character string analysis using digital circuitry and, more
particularly, for correlating a digraph corresponding to an unknown
word or phrase stored in a first memory array with a plurality of
digraphs corresponding to known dictionary words or phrases
respectively stored in a plurality of second memory arrays.

BACKGROUND ART

Many forms, such as census forms, require an individual to answer
questions by hand printing a response on the form. Most responses
include words that are chosen from a list that is either implicitly or
explicitly defined for the person filling out the form. For example, an
implicitly defined list includes the list of Indian tribes that appears
on a United States Government census form. An explicitly defined list
includes, for example, lists of various diseases that are stated on an
insurance application form.

Traditionally, the response is entered into a computer manually using a
keyboard. Optical character recognition (OCR) systems automatically
convert these hand-printed responses into a computer-readable format.
Identifying hand-printed words read by the OCR systems may be difficult
because there may be a number of spelling errors in the words. Spelling
errors include errors made by the persons filling out the forms (e.g.,
insertion, deletion, substitution, and transposition of letters), as
well as the character recognition errors of the OCR system (e.g.,
letter substitution and segmentation errors). At the present time, most
errors in using state of the art OCR techniques are attributable to OCR
recognition errors and not human errors in hand-printed responses.

For each application of the OCR techniques, there is a maximum
tolerable error rate corresponding to the number of words which are
either unidentifiable or incorrectly identified. If this maximum
tolerable error rate is exceeded, then OCR cannot replace manual
keyboard entry. Currently, state of the art OCR techniques have an
error rate which is too high for most applications. Thus, there is a
need to develop better OCR methods or to develop word identification
methods that are more tolerant of the various types of errors
encountered in the OCR of hand-written forms.

Several methods for identifying words in OCR of hand-written forms are
disclosed in U.S. patent application Ser. No. 07/911,698, now U.S. Pat.
No. 5,329,598, filed Jul. 10, 1992, for "METHOD AND APPARATUS FOR
ANALYZING CHARACTER STRINGS", incorporated herein by reference. This
patent application discloses a general purpose parallel computer for
implementing methods for analyzing character strings. However, there is
still a need for a simple, low cost architecture that is optimized for
a particular character string analysis method.

SUMMARY OF THE INVENTION

The principal object of the present invention is to provide a simple
and economical apparatus for implementing character string analysis
methods.

A further object of the invention is to provide an optimized apparatus
for a particular character string analysis method.

A correlating apparatus according to the invention includes a first
memory area storing data corresponding to an unknown word and a
plurality of second memory areas respectively storing data
corresponding to one of a plurality of dictionary words. A control
mechanism coupled to the first and second memory areas divides the
unknown word into a plurality of unknown word fragments and divides
each of the dictionary words stored in the second memory areas into a
plurality of dictionary word fragments. An arithmetic logic unit
compares the unknown word fragments to the dictionary word fragments
and generates an output signal indicating a hit when an unknown word
fragment corresponds to at least one of the dictionary word fragments.
A hit counter counts the number of hits.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of an apparatus for correlating words and
phrases according to an embodiment of the invention.

FIG. 2 is a block diagram of an embodiment of the word identifier logic
unit shown in FIG. 1.

FIG. 3 is a simplified diagram of another embodiment of the word
identifier logic unit shown in FIG. 1.

FIG. 4 is a diagram of the second embodiment of the word identifier
logic unit shown in FIG. 3.

FIG. 5 is a diagram of a shift register shown in FIGS. 3 and 4.

An embodiment of a correlating apparatus 1 for correlating words or
phrases is shown in FIG. 1. The correlating apparatus 1 includes a CPU
2 connected to a plurality of word identifier logic units 3 and to an
input/output (I/O) logic unit 4 through a system bus 5. The correlating
apparatus 1 may be a single integrated circuit chip or may be a
plurality of integrated circuit chips located on the same or separate
circuit boards. Only a single word identifier logic unit 3 is required
for the operation of the invention. The CPU 2 may be, for example, a
separate general purpose computer, a microprocessor, or a digital
control logic unit. The I/O logic unit 4 may be coupled to another
computer system, a scanner, or another input-output device.

Referring to FIG. 2, an embodiment of the word identifier logic unit 3
includes an internal bus 12 coupled to a control unit 6, an unknown
word memory array 11, N known word memory arrays 7_1-7_N, where N is an
integer equal to at least 2, N arithmetic logic units (ALU) 8_1-8_N, N
hit counters 9_1-9_N, and N miss counters 10_1-10_N. The internal bus
12 transmits and receives data and controls information flow from and
to the system bus 5. The source of information on the system bus 5 is
not limited to the structure shown in FIG. 1. The control unit 6 is
coupled to the memory arrays 7_1-7_N, 11, the ALUs 8_1-8_N, the hit
counters 9_1-9_N, and the miss counters 10_1-10_N.

The unknown word memory array 11 is coupled to each of the N ALUs, ALU
8_1 to ALU 8_N. The known word memory array 7_1 is coupled to the ALU
8_1, the known word memory array 7_2 is coupled to ALU 8_2, . . . and
the known word memory array 7_N is coupled to ALU 8_N. The number of
memory arrays employed can vary depending on the particular use of the
invention.

A known word may, for example, correspond to a word in an
electronically stored dictionary containing a list of recognizable
words expected in response to a particular question on a form. All of
the known word memory arrays may be loaded in parallel in a single
operation. Parallel loading decreases the time required to initialize
the known word memory arrays. Alternatively, known word memory arrays
may be loaded serially. The number of known words contained in the
dictionary may range, for example, between 100 and 100,000 words. Each
of the known word memory arrays 7_1-7_N preferably contains one or more
of the words from the list of dictionary words. Each known word memory
array 7 contains a plurality of bits, for example, between 10 and 1,000
bits. In a preferred embodiment, each of the known word memory arrays
7_1-7_N contains one known dictionary word or phrase. The known word
memory arrays may include a random access memory (RAM), a read only
memory (ROM), an electrically erasable programmable read only memory
(EEPROM), or other types of memories for storing data electronically.

An unknown word may, for example, correspond to a series of characters
produced by OCR software. The unknown word memory array 11 may contain
one or more unknown words, for example a phrase. In a preferred
embodiment, the unknown word memory array 11 contains one unknown word.
The unknown word contains a plurality of bits, for example, between 10
and 1,000 bits. The unknown word memory array 11 preferably contains
the same number of bits as contained in each of the known word memory
arrays 7_1-7_N. The unknown word memory array 11 is preferably a
readable/writable memory.

Unknown words and control information are preferably received from the
system bus 5. A plurality of bit patterns corresponding to known
dictionary words may be received from the system bus 5 and stored in
the known word memory arrays 7_1-7_N. If the known words are received
from the system bus 5, in a preferred embodiment, updating of the known
words is only performed when changing dictionaries or when power is
first applied to the correlating apparatus 1 (i.e., each time the
apparatus is powered up). Alternatively, the known word memory arrays
may be preprogrammed with a plurality of bit patterns corresponding to
known dictionary words.

In operation, the structures of FIGS. 1 and 2 are capable of
implementing the methods disclosed in U.S. patent application Ser. No.
07/911,698, filed Jul. 10, 1992, for "METHOD AND APPARATUS FOR
ANALYZING CHARACTER STRINGS" in a highly efficient manner using
serial/parallel operations. By way of illustration and not limitation,
the operation of the structures of FIGS. 1 and 2 will be discussed with
regard to a single example of operation.

An unknown word received from the system bus 5 is stored in the unknown
word memory array 11. As discussed above, a plurality of known words
has been previously stored in the known word memory arrays 7_1-7_N.
Each of the known words in the known word memory arrays 7_1-7_N is
compared to the unknown word in the unknown word memory array 11
through one or more of the ALUs 8_1-8_N. The comparison may include
such logical operations as OR, NAND, addition, subtraction, inversion,
etc. The results of the comparison are stored in one of the hit
counters 9_1-9_N or one of the miss counters 10_1-10_N. Alternatively,
the results of the comparison may be directly output to the internal
bus 12. In a preferred embodiment, the comparison includes logical AND
and XOR functions.

When comparing the contents of a first memory array 
with the contents of a second memory array, in an 
AND comparison, a "hit" occurs whenever both the 
first and second memory arrays contain a logic "1" at 
identical bit address locations within each memory ar- 
ray. A "hit count" is defined as the number of hits that 
occur when the contents of the first memory array are 
compared with the contents of the second memory
array. The logic "1" bits produced as a result of the 
logic AND comparison are indicative of the number of 
hits or agreements that occurred in the comparison. 
One of the hit counters 9_1-9_N counts the number of hits
(hit count) that occur as a result of a particular compari- 
son. When comparing a first memory array with a sec- 
ond memory array in a logical XOR operation, a "miss" 
occurs whenever the first and second memory arrays 
contain opposite logic values (i.e., logic "0" and "1") for 
an identical bit address location within each memory 
array. A "miss count" is defined as the number of misses 
that occur when the contents of a first memory array 
are compared to the contents of a second memory ar- 
ray. 

A logical XOR of one of the plurality of known word memory arrays
7_1-7_N with the unknown word memory array 11 produces a logic "1" bit,
in this case a miss, whenever identical bit address locations within
each of the memory arrays contain opposite logic levels (i.e., logic
"1" and logic "0"). The logic "1" bits produced as a result of the
logic XOR comparison are indicative of the number of misses that
occurred in the comparison. The corresponding miss counters 10_1-10_N
count the number of misses (miss count) that occur for the respective
known word memory arrays.
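
The hit and miss definitions above can be illustrated with a minimal Python sketch. It models each memory array as an integer bit pattern; the function name, the 8-bit width, and the sample patterns are assumptions made for illustration and do not come from the patent.

```python
from typing import Dict, Tuple


def hit_and_miss_counts(unknown: int, known: int, width: int) -> Tuple[int, int]:
    """Return (hit count, miss count) for two bit patterns of `width` bits.

    A hit is a bit position where both patterns hold a 1 (logical AND).
    A miss is a bit position where the patterns differ (logical XOR).
    """
    mask = (1 << width) - 1
    hits = bin(unknown & known & mask).count("1")
    misses = bin((unknown ^ known) & mask).count("1")
    return hits, misses


# Hypothetical 8-bit patterns standing in for the memory arrays.
unknown_word = 0b10110010
dictionary: Dict[str, int] = {
    "word_a": 0b10110010,  # identical to the unknown pattern
    "word_b": 0b10010110,  # partly overlapping
    "word_c": 0b01001101,  # exact complement
}

# One comparison per known word, as one ALU per known word array would do.
counts = {name: hit_and_miss_counts(unknown_word, bits, 8)
          for name, bits in dictionary.items()}
```

Here `counts["word_a"]` is `(4, 0)`, `counts["word_b"]` is `(3, 2)`, and `counts["word_c"]` is `(0, 8)`. An identical pattern gives the highest hit count and no misses.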

The control unit 6 may be utilized to output data stored in each hit
counter 9_1-9_N and miss counter 10_1-10_N in response to address
signals on the system bus 5. Alternatively, the control unit 6 may make
calculations from the results output from the counters 9_1-9_N,
10_1-10_N to determine which known word has the greatest likelihood of
corresponding to the unknown word. By performing this calculation
internally, only a subset of values, corresponding to those known words
having a high level of confidence, are output from the word identifier
logic unit 3. A preferred calculation method for determining a
confidence factor is:



CONFIDENCE FACTOR = 1 - ( -^ff^ )• 

It may be preferable to incorporate a plurality of 
word identifier logic units 3 into the correlating appara- 
tus 1. In this manner, a single dictionary of known 
words may be divided among a plurality of word identi- 
fier logic units 3. Alternatively, a different word identi- 
fier logic unit 3 may be programmed with a different 
dictionary corresponding to each question on a form. 
Thus, the structure of FIGS. 1 and 2 enables processing of
unknown words at a rate that is independent of
the number of known words in the dictionary and inde- 
pendent of the number of questions on a form under 
consideration. 
The particular implementation of the word identifier logic unit 3 may
vary depending on a particular application. For applications that
require a high speed and have a relatively limited number of known
dictionary words, it may be preferable to include the entire
correlating apparatus on a single semiconductor chip. It is well known
in the art that calculations on a single semiconductor substrate can
occur at a higher rate than calculations carried out over a plurality
of semiconductor substrates.

FIG. 3 is a simplified diagram of another embodiment of the word
identifier logic unit 3. An unknown word shift register 21 preferably
includes a 784 bit shift register having addresses (A0-A9), and
separate data input (D) and data output (OUT) paths. Similarly, a known
word shift register 22 preferably includes a 784 bit shift register
having address signals (A0-A9), a data input signal (D), and a data
output signal (OUT) path. Although the unknown word shift register 21
and the known word shift register 22 are preferably 784 bits in length,
other suitable bit lengths may also be utilized. The shift registers 21
and 22 may be operated in either a random access mode or a shift
register mode.

When operated in the random access mode, each of the shift registers 21
and 22 may be viewed as a two-dimensional array with address signals
A0-A4 accessing address bits within a row (column address) and address
signals A5-A9 accessing an individual row of address bits (row
address). However, the invention is not limited to this configuration.
There may be applications where it may be advantageous to construct the
shift register as a one-dimensional array. When operated in the random
access mode, each bit is randomly writable in response to an
appropriate address signal.

When operated in the shift register mode, each of the shift registers
21 and 22 may connect the output of a bit in row N, column M-1, with
the input of a bit in row N, column M. The output of a bit in the last
column address of row N-1 is preferably connected to the input of a bit
in the first column address of row N. The output of a bit in the last
column address of the last row (row N) may be connected to the input of
a bit in the first column address of the first row (row 1). Thus, when
operated in the shift register mode, the shift registers 21 and 22
operate as conventional 784 bit circular shift registers.

The outputs (OUT) from the unknown word shift register 21 and from the
known word shift register 22 are input into two inputs of a hit AND
gate 23. Similarly, the outputs (OUT) from the unknown word shift
register 21 and the known word shift register 22 are input into two
inputs of a miss XOR gate 24. In the embodiment of the invention
illustrated in FIG. 3, an arithmetic logic unit 25 comprises the AND
gate 23 and the XOR gate 24. A 10-bit hit counter 26 is coupled to the
output of the AND gate 23, and a 10-bit miss counter 27 is coupled to
the output of the XOR gate 24.

In operation, a bit pattern corresponding to an unknown word is stored
in the unknown word shift register 21 and a bit pattern corresponding
to a known word is stored in the known word shift register 22 by
selecting appropriate addresses (A0-A9). The bit pattern corresponding
to the unknown word is shifted out of the unknown word shift register
21 in a serial format. Similarly, a bit pattern corresponding to the
known word is simultaneously shifted out of the known word shift
register 22 in a serial format. The 10-bit hit counter 26 is
incremented each time the shift registers 21 and 22 are shifted one bit
and a logic 1 is output from the AND gate 23. The 10-bit miss counter
27 is incremented each time the shift registers 21 and 22 are shifted
one bit and a logic 1 is output from the XOR gate 24.

FIG. 3 shows a single known word shift register 22 operating in
conjunction with a single unknown word shift register 21. As is
apparent from FIG. 2, a single unknown word shift register 21 may be
used in combination with N known word shift registers 22_1-22_N. Such a
configuration is schematically illustrated in FIG. 4.

FIG. 4 is a detailed diagram of the word identifier logic unit 3 of
FIG. 3. A system bus 5 carries various signals including a write enable
(We), data signals (D0-D31), address signals (A0-A16), a clock signal
(CLK), a read signal (Rd), and an interrupt signal (INTERRUPT).

Address signals A10-A16 are input into a 7 to 128 bit decoder 30. The
first 127 outputs from the 7 to 128 bit decoder 30 are respectively
coupled to corresponding AND gates of a plurality of counter enable AND
gates 31_1-31_N and to respective chip enable inputs, Ce0-CeN, of the
shift registers 21 and 22_1-22_N. The read signal Rd is connected to a
second input of each of the plurality of counter enable AND gates
31_1-31_N.

The outputs from the counter enable AND gates 31_1-31_N are connected
to an enable input (G) of a corresponding one of 10-bit hit counters
26_1-26_N and to an enable input (G) of a corresponding one of 10-bit
miss counters 27_1-27_N.

When N is less than 127, the 128th output (Ce127) from the 7 to 128 bit
decoder 30 is coupled to a 5-bit register 32. (The invention is not
restricted to N<127. If an embodiment is constructed on a single
semiconductor chip, the available area may limit N to a number that may
be greater than or less than 127.) The 5-bit register 32 also receives
data signals D0-D4 and the We signal from the system bus 5. The 5-bit
register 32 preferably outputs a load signal (LOAD), a clear signal
(CLR), a data select signal (DS), a zero 1 signal (ZERO1), and a zero 2
signal (ZERO2).

Address signals A0-A9 are connected to address inputs of each of the
shift registers 21 and 22_1-22_N. The output signals (OUT) from each of
the shift registers 21 and 22_1-22_N are coupled to a data input (D) of
the same shift register. The output signal OUT from the unknown word
shift register 21 is connected to an input of each of a plurality of
hit AND gates 23_1-23_N and to an input of each of a plurality of miss
XOR gates 24_1-24_N. Outputs from each of the plurality of known word
shift registers 22_1-22_N are connected to the input of a corresponding
one of the plurality of hit AND gates 23_1-23_N and a corresponding one
of the miss XOR gates 24_1-24_N.

The zero 1 signal is connected to the zero input of the unknown word
shift register 21. The zero 2 signal is connected to the zero input of
each of the known word shift registers 22_1-22_N.

The outputs of the hit AND gates 23_1-23_N are connected to one of the
inputs of a corresponding one of clock AND gates 33. Another input of
each of the clock AND gates 33 is connected to the output of a clock
enable AND gate 34 through an inverter 35. The outputs of the clock AND
gates 33 are connected to corresponding ones of the 10-bit hit counters
26_1-26_N. The outputs of the XOR gates 24_1-24_N are connected through
respective clock AND gates 33 to the corresponding ones of 10-bit miss
counters 27_1-27_N. Likewise, the clock enable AND gate 34 output is
connected through the inverter 35 to one of the inputs of each of the
XOR gates 24_1-24_N.

The clock enable AND gate 34 receives the clock signal CLK from the
system bus 5 at a first input, a data select signal DS from the 5-bit
register 32 at a second input, and an interrupt signal INTERRUPT from a
carry output of a presettable 10-bit counter 36 at a third input. The
clock enable AND gate 34 outputs a controlled clock signal CCLK to the
input of clock inverter 35, a clock input of the presettable 10-bit
counter 36, and a clock input of each of the shift registers 21 and
22_1-22_N.

The presettable 10-bit counter 36 has a clear input connected to the
clear signal CLR of the 5-bit register 32, a load input connected to
the load signal LOAD of the 5-bit register 32, and a 10-bit data port
connected to various combinations of logic 0 and logic 1. The interrupt
signal INTERRUPT produced by the carry output of the presettable 10-bit
counter 36 is also connected to the interrupt signal INTERRUPT of the
system bus 5.

Each of the 10-bit hit counters 26_1-26_N and each of the 10-bit miss
counters 27_1-27_N has a clear input connected to the clear signal CLR
from the 5-bit register 32. Each of the 10-bit hit counters 26_1-26_N
has a 10-bit data output connected to data signals D0-D9 of the system
bus 5. Each of the 10-bit miss counters 27_1-27_N has a 10-bit data
output connected to data signals D16-D25 of the system bus 5.

FIG. 5 is a detailed diagram of a preferred embodiment of one of the
shift registers 21 and 22_1-22_N shown in FIGS. 3 and 4. Referring to
FIG. 5, address signals A0-A4 from the system bus 5 are input into a 5
to 32 bit X-digraph decoder 40, and address signals A5-A9 from the
system bus 5 may be input into a 5 to 32 bit Y-digraph decoder 41. The
unknown word shift register 21 and the plurality of known word shift
registers 22_1-22_N each contain 784 identical memory cells
(1000-1783). A first memory cell 1000 may include a D flip/flop 42_0, a
memory cell select AND gate 43_0, an inverter 44_0, a clock select OR
gate 45_0, and a data select OR gate 46_0.

In the first memory cell 1000, the memory cell select AND gate 43_0 has
a first input connected to one of the chip enable output signals
Ce0-Ce127 of the 7 to 128 bit decoder 30, a second input connected to
the output signal X0 of the 5 to 32 bit X-digraph decoder 40, a third
input connected to the output signal Y0 of the 5 to 32 bit Y-digraph
decoder 41, and a fourth input connected to the write enable signal We
from the system bus 5. The clock select OR gate 45_0 has a first input
connected to the controlled clock signal CCLK and a second input
connected to the output of the memory cell select AND gate 43_0. The
clock select OR gate 45_0 has an output connected to a clock input CLK
of the D flip/flop 42_0. A clear input of D flip/flop 42_0 is connected
to either the ZERO1 or ZERO2 signal output from the 5-bit register 32.

The input of the inverter 44_0 is connected to the data select signal
DS of the 5-bit register 32. The data select OR gate 46_0 has an output
connected to a D input of D flip/flop 42_0. The data select OR gate
46_0 has a first input connected to the output of inverter 44_0 and a
second input connected to an output Q of the D flip/flop 42_783 in a
preceding memory cell. As discussed above with reference to FIG. 4, the
data input D of each of the plurality of shift registers 21 and
22_1-22_N is respectively connected to the data output OUT of the same
shift register. Thus, in FIG. 5, the data output OUT from the Nth
memory cell 1783 is connected to the data input D of the first memory
cell 1000, thereby forming a circular shift register. As discussed
above with reference to FIG. 3, the output of each memory cell within
the circular shift register is connected to the input of the next
memory cell.

The second memory cell 1001 of FIG. 5 is constructed in a similar
fashion as the first memory cell 1000. The second memory cell 1001
differs from the first memory cell 1000 in that the second input of
data select OR gate 46_1 of the second memory cell 1001 is connected to
an output Q of the D flip/flop 42_0 of the first memory cell 1000.
Additionally, a different 5-bit X-digraph decode signal is input into
the first input of AND gate 43_1. Memory cells 1002 through 1783 are
similarly constructed with the exception that each memory cell receives
a different X-Y digraph input combination.

In operation, the embodiment of the word identifier logic unit 3 of
FIGS. 3-5 performs a similar correlation function as the embodiment of
FIGS. 1 and 2. The invention is not limited to the particular
embodiment of FIG. 1 for supplying the appropriate address and control
signals across the system bus 5 or to the particular configuration of
the control circuitry disclosed in FIGS. 4 and 5.

The CPU 2 of FIG. 1 supplies various control, data, and address signals
to the word identifier logic unit 3 shown in FIG. 4. For example, the
CPU 2 can store information into the 5-bit register 32 by outputting a
logical address of 1111111 in binary code on address lines A10-A16 in
conjunction with a write signal We. Data appearing on data lines D0-D4
are then stored in respective memory bits of the 5-bit register 32
(ZERO2, ZERO1, DS, CLR, LOAD). Storing a logic 0 in bit location 0
causes the zero 2 signal ZERO2 to be at a logic 0 level. Storing a
logic 1 in bit location 0 causes the zero 2 signal ZERO2 to be at a
logic 1 level. The other bits in the 5-bit register operate similarly.
For example, storing a logic 0 in bit location 4 causes the load signal
LOAD to be at a logic 0 level. Storing a logic 1 in bit location 4
causes the load signal LOAD to be at a logic 1 level. The actual bit
length of the 5-bit register 32 is unimportant. It may be convenient to
implement this register with a bit length greater than 5 bits, leaving
the remaining bits unused.

The zero 2 signal ZERO2 is connected to a clear input of each of the D
flip/flops 42_0-42_783 of the plurality of known word shift registers
22_1-22_N. When the zero 2 signal ZERO2 is brought to a logic 1 level,
all of the known word shift registers 22_1-22_N are cleared (i.e., a
logic 0 level is stored into all memory cells 1000 through 1783 of each
known word shift register 22_1-22_N). When the zero 2 signal ZERO2 is
brought to a logic 0 level, the functioning of the D flip/flops
42_0-42_783 is not affected.

The zero 1 signal ZERO1 is connected to a clear input of each of the D
flip/flops 42_0-42_783 of the unknown word shift register 21. When the
zero 1 signal ZERO1 is brought to a logic 1 level, the unknown word
shift register 21 is cleared (i.e., a logic 0 level is stored into all
memory cells 1000 through 1783 of the unknown word shift register 21).
When the zero 1 signal ZERO1 is brought to a logic 0 level, the
functioning of the D flip/flops 42_0-42_783 is not affected.

As shown in FIG. 4, the clear signal CLR of the 5-bit register 32 is
connected to a clear input of each of the plurality of 10-bit hit
counters 26_1-26_N, each of the plurality of 10-bit miss counters
27_1-27_N, and the presettable 10-bit counter 36. The clear signal CLR
clears each of the 10-bit counters 36, 26_1-26_N, and 27_1-27_N. When
the clear signal CLR is brought to a logic 1 level, each of the 10-bit
counters 36, 26_1-26_N, and 27_1-27_N is cleared (i.e., a logic 0 level
is stored into all bit locations). When the clear signal CLR is brought
to a logic 0 level, the functions of the 10-bit counters 36, 26_1-26_N,
and 27_1-27_N are not affected.

A load signal LOAD loads a preset value into the presettable 10-bit
counter 36. When the load signal LOAD is brought to a logic 1 level,
the presettable 10-bit counter 36 is loaded with the value of 0F0H.
When the load signal LOAD of the 5-bit register 32 is brought to a
logic 0 level, the presettable 10-bit counter 36 is not affected and
counting can begin.
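
The following Python sketch shows why a preset of 0F0H fits a 784 bit register, and models one serial comparison pass. It assumes the counter increments once per controlled clock and raises its carry when it rolls over from 3FFH. That behavior is an interpretation and is not stated in the text. The list-based register model and the function names are likewise hypothetical.

```python
from typing import List, Tuple

REGISTER_BITS = 784        # length of each circular shift register
PRESET = 0x0F0             # value loaded by the LOAD signal
COUNTER_MODULUS = 1 << 10  # a 10-bit counter rolls over at 1024


def clocks_until_carry(preset: int = PRESET) -> int:
    """Count clocks until the assumed 10-bit up-counter rolls over."""
    count, clocks = preset, 0
    while count < COUNTER_MODULUS:
        count += 1
        clocks += 1
    return clocks


def serial_compare(unknown: List[int], known: List[int]) -> Tuple[int, int, List[int], List[int]]:
    """Shift both registers one full revolution, counting hits and misses."""
    assert len(unknown) == len(known) == REGISTER_BITS
    hits = misses = 0
    for _ in range(clocks_until_carry()):
        u, k = unknown[-1], known[-1]  # OUT of the last memory cell
        hits += u & k                  # hit AND gate output
        misses += u ^ k                # miss XOR gate output
        # Circular shift: OUT feeds back into the first memory cell.
        unknown = [u] + unknown[:-1]
        known = [k] + known[:-1]
    return hits, misses, unknown, known
```

Under these assumptions `clocks_until_carry()` returns 784, since 1024 minus 240 is 784. The carry would therefore stop the controlled clock after exactly one revolution, leaving both registers holding their original contents.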

The data select signal DS places the shift registers 21 and 22_1-22_N
in either the random access mode or the shift register mode. When the
data select signal DS of the 5-bit register 32 is at a logic 0 level,
the shift registers 21 and 22_1-22_N operate in the random access mode.
The logic 0 level of the data select signal DS is inverted by inverter
44_i and supplied to the data input D of D flip/flop 42_i through the
data select OR gate 46_i. Thus, each D flip/flop 42_i in every memory
cell in every shift register 21 and 22_1-22_N has a data input D set to
a value of logic 1.

Referring to FIGS. 4 and 5, the 5 to 32 bit X-digraph decoder 40
decodes address bits A0-A4, the 5 to 32 bit Y-digraph decoder 41
decodes address bits A5-A9, and the 7 to 128 bit decoder 30 decodes
address bits A10-A16. The 5 to 32 bit X-digraph decoder 40, the 5 to 32
bit Y-digraph decoder 41, and the 7 to 128 bit decoder 30 respectively
output a plurality of decoded signals X0-Xm, Y0-Ym, and Ce0-Ce127. A
different combination of these decoded signals X0-Xm, Y0-Ym, and
Ce0-Ce127 is input into each memory cell select AND gate 43 so that a
logic 1 may be selectively written into each memory cell within the
shift registers 21 and 22_1-22_N.

In the random access mode, a logic 1 may be written into any selected
memory cell by applying an appropriate address on address signals
A0-A16 while simultaneously actuating the write enable signal We. For
example, to store a logic 1 into memory cell location 1000 of the
unknown word shift register 21, the central processing unit 2 would
output a write enable signal We in conjunction with an address having a
value of 00000H. To store a logic 1 into memory cell location 1001 of
the unknown word shift register 21, the central processing unit 2 would
output an address having a value of 00001H in conjunction with a write
enable signal We. To store a logic 1 into the first memory cell
location of the first known word shift register 22_1, the central
processing unit 2 would output an address having a value of 00400H in
conjunction with a write enable signal We.
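
The address examples above follow from the three address fields: A0-A4 for the X-digraph decoder, A5-A9 for the Y-digraph decoder, and A10-A16 for the chip enable decoder. A minimal Python sketch of that packing is given below. The function names are hypothetical. The mapping of chip enable 0 to the unknown word shift register 21 and chip enable 1 to the first known word shift register 22_1 is inferred from the example addresses.

```python
from typing import Tuple


def make_address(chip_enable: int, y: int, x: int) -> int:
    """Pack the three decoder fields into a 17-bit address (A0-A16)."""
    if not (0 <= x < 32 and 0 <= y < 32 and 0 <= chip_enable < 128):
        raise ValueError("field out of range")
    # A0-A4 carry x, A5-A9 carry y, A10-A16 carry the chip enable number.
    return (chip_enable << 10) | (y << 5) | x


def split_address(address: int) -> Tuple[int, int, int]:
    """Unpack a 17-bit address into (chip_enable, y, x)."""
    return (address >> 10) & 0x7F, (address >> 5) & 0x1F, address & 0x1F


first_unknown_cell = make_address(0, 0, 0)    # 0x00000
second_unknown_cell = make_address(0, 0, 1)   # 0x00001
first_known_cell = make_address(1, 0, 0)      # 0x00400
control_register = make_address(127, 0, 0)    # A10-A16 all ones
```

The last line reproduces the 1111111 pattern on A10-A16 that selects the 5-bit register 32.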

When operated in the random access mode, each shift register 21,
22_1-22_N may be viewed as a two-dimensional array with address signals
A0-A4 accessing address bits within a row (X0-X27 column addresses),
and address signals A5-A9 accessing an individual row of address bits
(Y0-Y27 row addresses). In this mode, each bit within each shift
register 21 and 22_1-22_N is randomly writable by applying appropriate
address and control signals from the system bus 5.

In a preferred mode of operation, it may be desirable
to utilize address signals A0-A4 as a first letter bus and
address signals A5-A9 as a second letter bus. In the
general case, each letter bus would have K lines. In the
preferred embodiment, each letter bus has 5 lines that
are used to encode a single ASCII symbol such as a
particular letter. The hex codes for the ASCII symbols
@ through [ range from 40H through 5BH. It is apparent
to those skilled in the art that only the five lower
order bits of the ASCII code are required to uniquely
identify the subset of ASCII symbols falling within the
range of 40H-5BH. Thus, it is desirable to identify these
ASCII codes with an input to the digraph decoders
ranging from 0 through 1BH.
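
The reduction of an ASCII symbol to a 5-line letter bus value can be sketched as follows. This is a minimal illustration, relying on the fact that in standard ASCII the symbol @ is 40H and the symbol [ is 5BH. The function name letter_code is hypothetical.

```python
def letter_code(symbol: str) -> int:
    """Return the five low-order bits of an ASCII symbol in the range @ to [."""
    code = ord(symbol)
    if not 0x40 <= code <= 0x5B:
        raise ValueError("symbol must lie between '@' and '['")
    return code & 0x1F  # keep only the five low-order bits


for symbol in "@AZ[":
    print(symbol, f"{ord(symbol):02X}H", "->", letter_code(symbol))
# @ 40H -> 0
# A 41H -> 1
# Z 5AH -> 26
# [ 5BH -> 27
```

Because the 28 symbols occupy consecutive ASCII codes starting at 40H, masking with 1FH maps them onto 0 through 27 without collisions.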

In the random access mode, the 5 to 32 bit X-digraph
decoder 40 and the 5 to 32 bit Y-digraph decoder 41 are
standard demultiplexers with only the first 28 output
bits utilized. In the general case, the decoders 30, 40, 41
may have any number of inputs and corresponding
outputs. Truth tables for the operation of the 5 to 32 bit
X-digraph decoder 40 and the Y-digraph decoder 41
are respectively shown in Tables 1 and 2.

TABLE 1

5 to 32 Bit X-digraph Decoder Truth Table

INPUT       A4  A3  A2  A1  A0    OUTPUT
@ - X0       0   0   0   0   0    @ = 1 (all others 0)
A - X1       0   0   0   0   1    A = 1 (all others 0)
B - X2       0   0   0   1   0    B = 1 (all others 0)
...
Y - X25      1   1   0   0   1    Y = 1 (all others 0)
Z - X26      1   1   0   1   0    Z = 1 (all others 0)
[ - X27      1   1   0   1   1    [ = 1 (all others 0)


TABLE 2

5 to 32 Bit Y-digraph Decoder Truth Table

INPUT       A9  A8  A7  A6  A5    OUTPUT
@ - Y0       0   0   0   0   0    @ = 1 (all others 0)
A - Y1       0   0   0   0   1    A = 1 (all others 0)
B - Y2       0   0   0   1   0    B = 1 (all others 0)
...
Y - Y25      1   1   0   0   1    Y = 1 (all others 0)
Z - Y26      1   1   0   1   0    Z = 1 (all others 0)
[ - Y27      1   1   0   1   1    [ = 1 (all others 0)
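
The behavior summarized in the truth tables can be sketched in a few lines. The Python below is only an illustration of a demultiplexer with 28 utilized outputs. It assumes that the unused codes 28 through 31 drive none of the utilized outputs, and the function name decode_one_hot is hypothetical.

```python
from typing import List


def decode_one_hot(code: int, used_outputs: int = 28) -> List[int]:
    """Model a 5 to 32 bit decoder of which only the first 28 outputs are used.

    Exactly one utilized output is 1 when code is below used_outputs.
    Codes 28-31 are assumed to leave every utilized output at 0.
    """
    if not 0 <= code < 32:
        raise ValueError("code must fit in 5 bits")
    return [1 if line == code else 0 for line in range(used_outputs)]


for symbol in "@ABYZ[":
    code = ord(symbol) & 0x1F  # five low-order bits of the ASCII code
    outputs = decode_one_hot(code)
    print(f"{symbol}  {code:05b}  X{outputs.index(1)} = 1 (all others 0)")
```

The printed rows correspond to the six rows listed in each table, with the input bits written most significant bit first.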



In a preferred embodiment, digraphs are formed from
the unknown word and each of the known words. The
digraphs are stored in the shift registers 21 and 22_1-22_N.
Each digraph is formed from the 26 letters of the English
language alphabet plus a word-start marker @ and
a word-end marker [. The 5 to 32 bit demultiplexers 40
and 41 have 28 output lines respectively labeled
X0-X27 and Y0-Y27 to correspond to ASCII @
through [.
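
As a rough illustration of the digraph representation, the sketch below splits a word into letter pairs after adding the word-start marker @ and the word-end marker [. It assumes that the first letter of each pair is the X code and the second letter the Y code, as in the loading procedure described in this section. The names digraphs and digraph_cells are hypothetical.

```python
def digraphs(word):
    """Return the letter pairs of a word framed by the @ and [ markers."""
    if not (word.isascii() and word.isalpha()):
        raise ValueError("word must contain only the letters A-Z")
    marked = "@" + word.upper() + "["
    return [first + second for first, second in zip(marked, marked[1:])]


def digraph_cells(word):
    """Return the set of (x, y) cell coordinates that would hold a logic 1."""
    return {(ord(pair[0]) & 0x1F, ord(pair[1]) & 0x1F) for pair in digraphs(word)}


print(digraphs("CAT"))               # ['@C', 'CA', 'AT', 'T[']
print(sorted(digraph_cells("CAT")))  # [(0, 3), (1, 20), (3, 1), (20, 27)]
```

A repeated pair sets the same cell more than once, so a word such as BANANA produces seven pairs but only five distinct cells.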

In a preferred method, a digraph corresponding to a
particular unknown word is formed in the unknown
word shift register 21 as follows. The unknown word
shift register is cleared using the zero 1 signal ZERO1
from the 5 bit register 32 as discussed above. The data
select signal DS is set to a logic 0 so that the shift
registers 21 and 22_1-22_N are operated in the random access
mode. The central processing unit 2 reads the first
unknown word from a memory location or the I/O device
4.






5,392,212 


The central processing unit 2 sets address signal lines
A0-A4 equal to the lower order 5 bits in the ASCII @
character signifying the start of a word, address signals
A5-A9 equal to the lower order 5 bits of the first letter
of the current unknown word, and address signals
A10-A16 equal to the address of the unknown word shift
register 21 (i.e., 00H). The central processing unit then
outputs a write enable signal We to write a logic 1 into
the selected memory cell location in the unknown word
shift register 21.

Next, the central processing unit 2 sets address signal
lines A0-A4 equal to the value of the address signals
A5-A9 used in the previous comparison. Address signals
A5-A9 are then set equal to the lower order 5 bits
of the next letter of the current unknown word. Address
signals A10-A16 remain equal to the address of the
unknown word shift register 21 (i.e., 00H). The central
processing unit then outputs a write enable signal We to
write a logic 1 into the selected memory cell location in
the unknown word shift register 21. This process is
repeated until address signals A5-A9 are set equal to the
end of word ASCII symbol [.

In a preferred embodiment, a digraph corresponding
to a particular known word is formed in a similar manner
as described above for the unknown word. When
writing a new dictionary containing a plurality of
known words, each of the known word shift registers
22_1-22_N is cleared using the zero 2 signal ZERO2 from
the 5 bit register 32 as discussed above. The data select
signal DS is set to a logic 0 so that the shift registers 21
and 22_1-22_N are operated in the random access mode.
The central processing unit 2 reads the first known
word from a memory location or the I/O device 4. The
known words are preferably only loaded once, upon
start-up or when changing dictionaries.

The central processing unit 2 then sets address signal
lines A0-A4 equal to the lower order 5 bits in the
ASCII @ character signifying the start of a word, address
signals A5-A9 equal to the lower order 5 bits of
the first letter of the current known word, and address
signals A10-A16 equal to one of the addresses of the
known word shift registers 22_1-22_N (i.e., 01H through
7EH). The central processing unit then outputs a write
enable signal We to write a logic 1 into the selected
memory cell location in the known word shift register
22_1-22_N.

The central processing unit 2 then sets address signal
lines A0-A4 equal to the value of the address signals
A5-A9 used in the previous comparison. Address signals
A5-A9 are then set equal to the lower order 5 bits
of the next letter of the current known word. Address
signals A10-A16 remain equal to the address of the
selected known word shift register 22_i. The central
processing unit then outputs a write enable signal We to
write a logic 1 into the selected memory cell location in
the known word shift register 22_i. This process is
repeated until address signals A5-A9 are set equal to the
end of word ASCII symbol [.

The above described process is repeated until a digraph
corresponding to a known word is stored in each
of the known word shift registers 22_1-22_N. Once all of
the shift registers 21 and 22_1-22_N have been loaded with
an appropriate digraph, the word identifier logic may
be placed in a shift register mode. The shift register
mode is preferably used to compare the unknown word
digraph loaded into the unknown word register 21 with
each of the known word digraphs loaded into the
known word registers 22_1-22_N. The comparison may be
performed in a bit serial manner by shifting the contents
of each of the shift registers 21 and 22_1-22_N a complete
rotation until all of the bits have been output through
the shift register data output OUT.

Prior to initiating the shift register mode, the 10-bit
hit counters 26_1-26_N, the 10-bit miss counters 27_1-27_N
and the 10-bit counter 36 are cleared, and the presettable
10-bit counter 36 is preset to 040H. The shift register
mode is initiated by bringing the data select signal DS to
a logic 1 level. With reference to FIG. 5, the data select
signal DS is inverted by the inverters 44_1-44_N so that a
logic 0 is input into the first input of data select OR
gates 46_1-46_N. In this configuration, each D flip/flop
42_2-42_N receives, at its data input D, a value equal to
the output of the preceding D flip/flop 42_1-42_(N-1). As
previously discussed, the output from the last D flip/flop
is coupled to the input of the first D flip/flop, completing
the circular configuration.

When the data select signal DS is changed to a logic
1 value, the controlled clock signal CCLK is enabled by
the clock enable AND gate 34. The controlled clock
signal CCLK is inverted by clock inverter 35 and input
into a plurality of clock AND gates 33.

The current shift register data bit appearing at the
output OUT of the unknown word shift register 21 is
input to the first input of the plurality of hit AND gates
23_1-23_N and input into the first input of the miss XOR
gates 24_1-24_N. The current shift register data bits
appearing at the output OUT of each of the known word
shift registers 22_1-22_N are respectively input into the
second input of the plurality of hit AND gates 23_1-23_N
and input into the second input of the plurality of miss
XOR gates 24_1-24_N. The hit AND gates 23_1-23_N have
an output equal to a logic 1 whenever both the connected
shift registers input a logic 1 into a respective hit
AND gate 23_i. The occurrence of a logic 1 at the output
of a hit AND gate 23_i indicates that a hit, i.e. match, has
occurred. The corresponding clock AND gate 33
cooperates with the inverted clock control signal
CCLK and the output from the hit AND gate 23_i to
increment the respective 10-bit hit counter on a falling
edge of the controlled clock signal CCLK each time
that a hit occurs.

The miss XOR gates 24_1-24_N have an output equal to
a logic 1 whenever identical bit address locations
currently being received at the input of a particular miss
XOR gate 24_i have opposite logic levels (i.e. logic "1"
and logic "0"). The occurrence of a logic 1 at the output
of a miss XOR gate 24_i indicates that a miss has
occurred. The plurality of clock AND gates 33
cooperate with the inverted clock control signal CCLK
and the output from the miss XOR gates 24 to increment
a respective 10-bit miss counter 27_i on the falling
edge of the controlled clock signal CCLK each time
that a miss occurs.

In FIG. 5 the control clock signal CCLK is coupled
to the clock input CLK of each of the D flip/flops
42_1-42_N through a respective clock select OR gate
45_1-45_N. The rising edge of the control clock signal
CCLK shifts a new data bit value into the memory cell
location output from the data out OUT signal from each
of the shift registers 21 and 22_1-22_N. The rising edge of
the control clock signal CCLK also increments the
presettable 10-bit counter 36. The falling edge of the
control clock signal CCLK increments each of the
10-bit hit counters 26_1-26_N provided a hit is indicated
by the corresponding hit AND gate and increments
each of the plurality of 10-bit miss counters 27_1-27_N
provided a miss is indicated by the corresponding miss
XOR gate. When the presettable 10-bit counter 36 has
been incremented 784 times, the carry output signal
CARRY will be activated. The number of times the
presettable 10-bit counter 36 increments before the
carry output signal CARRY is activated is determined
by the value loaded by the load signal LOAD. Activation
of the carry output activates the interrupt signal
INTERRUPT. The INTERRUPT signal informs the
central processing unit 2 via the system bus 5 that all bits
within the shift registers 21 and 22_1-22_N have been
compared. The interrupt signal INTERRUPT is also
input into the clock enable AND gate 34 and disables
the controlled clock, preventing any further shifting of
the shift registers 21 and 22_1-22_N or incrementing of the
10-bit counters 36, 26_1-26_N, and 27_1-27_N.

In the above manner, the bit pattern corresponding to
an unknown word is shifted out from the unknown
word shift register 21 in a serial bit format. Similarly, bit
patterns corresponding to a plurality of known words
are simultaneously shifted out from the known word
shift registers 22_1-22_N in a serial bit format. A particular
10-bit hit counter 26_i is incremented each time the shift
registers 21 and 22_1-22_N are shifted and a logic 1 is
output from the corresponding hit AND gate 23_i. A particular
10-bit miss counter 27_i is incremented each time the shift
registers 21 and 22_1-22_N are shifted and a logic 1 is
output from a corresponding XOR gate 24_i. When all
784 bits have been shifted out from the shift registers 21
and 22_1-22_N, the 10-bit hit counters 26_1-26_N and 10-bit
miss counters 27_1-27_N respectively contain the number
of hits and misses that have occurred in the comparison
of each known word with the unknown word.
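
The bit-serial comparison summarized above can be modeled in software. The sketch below is a simplified illustration and not the circuit itself: it assumes each register holds 28 x 28 = 784 cells in row-major order (the actual cell ordering is not specified here), and the names word_bits and correlate are hypothetical.

```python
CELLS = 28 * 28  # 784 bits per shift register


def word_bits(word):
    """Build the 784-bit digraph pattern for a word framed by @ and [."""
    if not (word.isascii() and word.isalpha()):
        raise ValueError("word must contain only the letters A-Z")
    marked = "@" + word.upper() + "["
    bits = [0] * CELLS
    for first, second in zip(marked, marked[1:]):
        x, y = ord(first) & 0x1F, ord(second) & 0x1F
        bits[y * 28 + x] = 1  # assumed row-major cell ordering
    return bits


def correlate(unknown, dictionary):
    """Return a (hits, misses) pair for each dictionary word."""
    unknown_reg = word_bits(unknown)
    known_regs = [word_bits(entry) for entry in dictionary]
    hits = [0] * len(known_regs)
    misses = [0] * len(known_regs)
    for cycle in range(CELLS):        # one clock cycle per shifted bit
        u = unknown_reg[cycle]
        for i, reg in enumerate(known_regs):
            k = reg[cycle]
            hits[i] += u & k          # hit AND gate
            misses[i] += u ^ k        # miss XOR gate
    return list(zip(hits, misses))


print(correlate("CAT", ["CAT", "CART", "DOG"]))
# [(4, 0), (3, 3), (0, 8)]
```

In the hardware all known word registers are compared during the same clock cycle, whereas the inner loop here visits them one after another. The resulting counts are the same.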

The contents of each of the respective 10-bit hit
counters 26_1-26_N and 10-bit miss counters 27_1-27_N are
individually accessible by the central processing unit 2
by outputting an appropriate address on address signals
A10-A16 in conjunction with the read enable Rd signal.
For example, to obtain the results of the comparison
between the unknown word and the first known word
stored in known word 1 shift register, the central
processing unit would output a logic 00H on address signals
A10-A16 in conjunction with a read signal. In response,
the first 10-bit hit counter 26_1 outputs a 10-bit counter
value onto data signal lines D0-D9, and the first 10-bit miss
counter 27_1 outputs a 10-bit counter value onto data
signal lines D16-D25. The central processing unit 2
then increments the address output on address signals
A10-A16 until all 10-bit hit counters 26_i and all 10-bit
miss counters 27_i have been read. After all the 10-bit hit
and miss counters 26_1-26_N and 27_1-27_N have been read,
the central processing unit 2 may clear all 10-bit counters
36, 26_1-26_N and 27_1-27_N by activating the clear
signal CLR.
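
A software routine reading these results might separate the two counts as sketched below. This is only an illustration, assuming the hit count is returned on data lines D0-D9 and the miss count on data lines D16-D25 of a single data word. The names unpack_counters and MASK_10_BITS are hypothetical.

```python
MASK_10_BITS = 0x3FF  # ten low-order bits


def unpack_counters(data_word):
    """Split one data word into (hits, misses).

    Assumed layout: D0-D9 carry the hit count, D16-D25 the miss count.
    """
    hits = data_word & MASK_10_BITS
    misses = (data_word >> 16) & MASK_10_BITS
    return hits, misses


# Example data word with 4 hits on D0-D9 and 3 misses on D16-D25.
example_word = (3 << 16) | 4
print(unpack_counters(example_word))  # (4, 3)
```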

I claim: 

1. An apparatus for comparing an unknown word to 
a plurality of known, dictionary words to identify the 
unknown word comprising: 

a first memory area storing data corresponding to an
unknown word; 

a plurality of second memory areas, each second 
memory area respectively storing data correspond- 
ing to one of a plurality of known, dictionary 
words;

control means coupled to the first memory area and 
the second memory areas including an unknown 
word decoder for decoding each of the unknown 



words and for forming unknown word digraphs 
respectively representing the plurality of unknown 
words, the control means storing the unknown 
word digraphs in the first memory area and includ- 
ing a dictionary word decoder for decoding each 
of the plurality of dictionary words for each of the 
known, dictionary words and for forming a plural- 
ity of dictionary word digraphs from the known, 
dictionary words respectively representing the 
known, dictionary words, and the control means 
respectively storing each of the plurality of dictio- 
nary word digraphs in the plurality of second mem- 
ory areas; 

an arithmetic logic unit coupled to the first memory 
area and the second memory areas for comparing 
the unknown word digraphs to the dictionary 
word digraphs and generating an output signal 
indicating a hit when an unknown word digraph 
corresponds to at least one of the dictionary word 
digraphs; and 

a hit counter for counting and indicating the number 
of hits. 

2. The apparatus of claim 1 wherein the arithmetic 
logic unit generates an output signal indicating a miss 
when an unknown word does not correspond to at least 
one of the dictionary words and including a miss 
counter for counting and indicating the number of 
misses. 

3. The apparatus of claim 2 wherein the arithmetic 
logic unit includes an AND gate for determining a hit 
count for each of the known, dictionary words and the 
miss counter includes an XOR gate for determining the 
miss count for each of the known, dictionary words. 

4. The apparatus of claim 3 wherein the arithmetic 
logic unit includes a plurality of AND gates and a plu- 
rality of XOR gates operating in parallel for simulta- 
neously determining the hit and miss count for each of 
the known, dictionary words. 

5. The apparatus of claim 1 wherein the first memory 
area and each of the plurality of second memory areas 
include circular shift registers. 

6. The apparatus of claim 2 wherein the control 
means includes means for determining a confidence 
factor associated with each known, dictionary word 
equal to: 



( MISSES ^ 

hits y 



(1) 



FACTOR. 



7. An apparatus for comparing an unknown word to 
a plurality of known dictionary words to identify the 
known word comprising: 

a plurality of known word circular shift registers 
respectively storing one of a plurality of known 
word digraphs, each known word corresponding 
to one of a plurality of known, dictionary words; 

an unknown word circular shift register storing data 
corresponding to an unknown word; and 

control means for decoding each of the unknown 
words to form an unknown word digraph, for stor- 
ing the unknown word digraph in the unknown 
word circular shift register, and for simultaneously 
shifting the unknown word circular shift register 
and each of the plurality of known word circular 
shift registers to compare the unknown word di- 
graph to the known word digraphs. 
8. The apparatus of claim 7 comprising:

an arithmetic logic unit including a plurality of logic
AND gates coupled to the unknown word circular
shift register and the known word circular shift
registers for simultaneously comparing a plurality
of bits respectively output from the plurality of
known word circular shift registers with a bit output
from the unknown word circular shift register
and generating an output signal indicating a hit
when bits for each known word circular shift register
correspond to bits in the unknown word circular
shift register; and

a counter for counting and indicating the number of
hits.



9. The correlating apparatus of claim 8 wherein the 
arithmetic logic unit includes a plurality of logic XOR 
gates coupled to the unknown word circular shift regis- 
ter and respectively coupled to the plurality of known 
word circular shift registers for determining a miss 
count for each known word digraph of the plurality of 
unknown word digraphs by simultaneously comparing 
a plurality of bits respectively output from the plurality 
of unknown word circular shift registers with a bit 
output from the unknown word circular shift register 
using the plurality of logic XOR gates, the miss count 
for each known word digraph indicating the number of 
bits, for each known word circular shift register, which 
do not correspond to bits in the unknown word circular 
shift register. 